Making Heads or Tails of Combined Landmark Configurations in GM data

Michael L. Collyer, Mark A. Davis, Dean C. Adams

6/8/2020

Imagine this scenario…

Digitizing landmarks on whole salamanders, covering both heads and tails.

Levis et al. (2016) Biological Journal of the Linnean Society, 118(3), 569–581.

Imagine this scenario…

Produces a GPA result that looks like this:

Imagine this scenario…

But separate GPAs on heads and tails give better results in terms of variation around individual landmarks!

Is there a way to combine Procrustes residuals from separate configurations for morphological analyses?

When one might wish to combine landmark configurations

Adams (1999) introduced methods for (1) fixing the articulation angle (2D configurations) between separate configurations or (2) appending subsets of data. Vidal-García et al. (2018) extended the fixed-angle concept to 3D data (multiple points and planar rotations).

Adams (1999) Evolutionary Ecology Research, 1, 959–970; Vidal-García et al. (2018) Ecology and Evolution, 8(9), 4669–4675.

When one might wish to combine landmark configurations

Non-articulated structures can be combined with the “separate subsets” method (Adams 1999). When combined, configurations should be scaled to relative sizes (GPA will render all configurations to unit size).

Davis et al. (2016) offered a simple way to do that with this equation:

\[CS^{'}_i=\frac{CS_{i}}{\sum_{i=1}^{g}CS_{i}},\]

where \(CS^{'}_i\) is the relative centroid size of configuration \(i\), which is a scalar multiplied by the coordinates when appending configurations. If one configuration is large and one is small, they will remain large and small in combination. (This is done per specimen.)

Note that, contrary to what Davis et al. (2016) suggested, combined configurations are not actually unit size. They are, however, scaled by the same rule across specimens.
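A minimal sketch of the sum-based scaling, in base R. The centroid sizes are made-up values for a single specimen, and `center`, `unit_config`, and `rel_cs_sum` are hypothetical names used only for illustration; random unit-size configurations stand in for real Procrustes residuals.

```r
# Hypothetical centroid sizes for one specimen: a small head and a large tail.
cs <- c(head = 5.7, tail = 24.2)

# Relative sizes from the sum-based equation: CS_i / sum(CS_i).
rel_cs_sum <- cs / sum(cs)

# Helper: center a p x k matrix of landmarks at the origin.
center <- function(x) sweep(x, 2, colMeans(x))

# Random stand-in for Procrustes residuals: centered, unit centroid size.
unit_config <- function(p, k = 2) {
  x <- center(matrix(rnorm(p * k), p, k))
  x / sqrt(sum(x^2))
}

set.seed(1)
z <- rbind(rel_cs_sum[["head"]] * unit_config(26),
           rel_cs_sum[["tail"]] * unit_config(64))

# Centroid size of the combined configuration: less than 1 under this scaling.
pooled_cs <- sqrt(sum(center(z)^2))
pooled_cs
```

The relative sizes sum to 1, but the combined centroid size equals the square root of the sum of their squares, which is below 1 whenever more than one configuration has non-zero size.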

Adams (1999) Evolutionary Ecology Research, 1, 959–970; Davis et al. (2016) PLoS ONE, 11(1), e0211753.

When one might wish to combine landmark configurations

Profico et al. (2020) demonstrated that by combining multiple 2D configurations, it was possible to produce similar PC dispersion patterns to 3D configurations, which might be beneficial if 3D data collection is not easy or possible.

They also found issues with the Davis et al. (2016) approach and offered a new solution for relativizing centroid sizes (more details soon).

Davis et al. (2016) PLoS ONE, 11(1), e0211753; Profico et al. (2020) Hystrix, the Italian Journal of Mammalogy, 30, 157–165.

How one should combine landmark configurations

Collyer et al. (2020) proposed a general formula for obtaining relative centroid sizes

\[CS^{'}_i=\frac{w_iCS_{i}}{\sqrt{\sum_{i=1}^{g} \left(w_iCS_{i}\right)^2}},\]

where the denominator is the pooled centroid size from multiple configurations and \(w_i\) are a priori weights. Relative centroid sizes are then used to scale Procrustes residuals, \(\mathbf{Z}_i\), i.e.,

\[\mathbf{Z} = \begin{pmatrix} CS^{'}_1\mathbf{Z}_1\\ CS^{'}_2\mathbf{Z}_2\\ \vdots\\ CS^{'}_g\mathbf{Z}_g\\ \end{pmatrix}.\]

\(\mathbf{Z}\) is a matrix of combined coordinates, centered at \(0,0\) (2D) or \(0,0,0\) (3D) with a (pooled) centroid size equal to \(1\).

If all \(w_i\) are equal (to \(1\)), we can call this relative centroid sizes via standard centroid size (\(SCS\)).

If \(w_i\) are not all equal, we can call this relative centroid sizes via weighted centroid size.
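The general formula can be sketched in base R as follows. This is an illustration under stated assumptions, not the geomorph implementation: the centroid sizes are made up, random unit-size configurations stand in for Procrustes residuals, and `relative_cs`, `center`, `centroid_size`, and `unit_config` are hypothetical helper names.

```r
# Relative centroid sizes: w_i * CS_i / sqrt(sum((w_i * CS_i)^2)).
relative_cs <- function(cs, w = rep(1, length(cs))) {
  wcs <- cs * w
  wcs / sqrt(sum(wcs^2))
}

center <- function(x) sweep(x, 2, colMeans(x))
centroid_size <- function(x) sqrt(sum(center(x)^2))

# Random stand-in for Procrustes residuals: centered, unit centroid size.
unit_config <- function(p, k = 2) {
  x <- center(matrix(rnorm(p * k), p, k))
  x / centroid_size(x)
}

set.seed(42)
cs <- c(head = 5.7, tail = 24.2)          # hypothetical sizes, one specimen
rel <- relative_cs(cs)                    # all w_i = 1, i.e. SCS
rel_w <- relative_cs(cs, w = c(0.3, 0.7)) # a priori weights (arbitrary here)

# Scale each set of residuals by its relative size, then stack the rows.
z <- rbind(rel[["head"]] * unit_config(26),
           rel[["tail"]] * unit_config(64))

colMeans(z)       # approximately 0, 0
centroid_size(z)  # approximately 1
```

Because each block of residuals is centered and has unit size before scaling, the stacked matrix stays centered and its squared centroid size is the sum of the squared relative sizes, which the denominator forces to 1 for any choice of weights.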

Collyer et al. (2020) Evolutionary Biology, in press.

How one should combine landmark configurations

Profico et al. (2020) found that the unweighted approach of Davis et al. (2016), and by extension the case where all \(w_i\) above are equal, had some flaws. They offered a solution in which

\[w_i = \left(p_{i}k \right)^{-1/2},\]

for the \(p_i\) landmarks of configuration \(i\) in \(k\) dimensions. (\(k\) is not needed when comparing centroid sizes of configurations in the same dimension.) These weights normalize centroid size (Dryden and Mardia 2016), giving normalized centroid size (\(NCS\)). Whereas centroid size is the square root of the sum of squared distances of landmarks to their centroid, normalized centroid size is the square root of the corresponding mean.

Collyer et al. (2020) Evolutionary Biology, in press; Profico et al. (2020) Hystrix, the Italian Journal of Mammalogy, 30, 157–165; Dryden & Mardia (2016). Statistical shape analysis: With applications in R. Wiley.

Why normalize centroid size to find relative sizes of configurations?

As Profico et al. (2020) illustrated, circles with the same radius and area, but digitized with different numbers of landmarks, have different \(CS^{'}\) when using \(SCS\) but not when using \(NCS\) to relativize.
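The circle comparison can be reproduced with a few lines of base R. This is a sketch assuming equally spaced landmarks on the circle outline; `circle`, `centroid_size`, and `normalized_cs` are hypothetical helper names, and the landmark counts are arbitrary.

```r
# p equally spaced landmarks on a circle of radius r, centered at the origin.
circle <- function(p, r = 1) {
  theta <- 2 * pi * (seq_len(p) - 1) / p
  cbind(r * cos(theta), r * sin(theta))
}

centroid_size <- function(x) sqrt(sum(sweep(x, 2, colMeans(x))^2))

# Normalized centroid size: CS multiplied by w = (p * k)^(-1/2).
normalized_cs <- function(x) centroid_size(x) / sqrt(nrow(x) * ncol(x))

sparse <- circle(10)  # same radius, few landmarks
dense  <- circle(40)  # same radius, many landmarks

# Centroid size grows with the number of landmarks (r * sqrt(p)).
c(centroid_size(sparse), centroid_size(dense))

# Normalized centroid size does not (r / sqrt(2) for both).
c(normalized_cs(sparse), normalized_cs(dense))
```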

Profico et al. (2020) Hystrix, the Italian Journal of Mammalogy, 30, 157–165.

Why normalize centroid size to find relative sizes of configurations?

This gives the impression that \(CS^{'}\) via \(NCS\) is independent of landmark density.

What about circles of different size?

What about configurations with interior and exterior landmarks (concentric circles)?

Normalizing centroid size is not a universal solution

I.e., \(NCS\) will tend to make smaller objects larger in relative size, if landmark density is the same.
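A small base R sketch of this effect, assuming two circles whose landmarks are equally spaced at the same density along the outline (so the circle with twice the radius carries twice the landmarks). The helper names `circle`, `centroid_size`, and `relative_cs` and the landmark counts are hypothetical.

```r
# p equally spaced landmarks on a circle of radius r, centered at the origin.
circle <- function(p, r = 1) {
  theta <- 2 * pi * (seq_len(p) - 1) / p
  cbind(r * cos(theta), r * sin(theta))
}

centroid_size <- function(x) sqrt(sum(sweep(x, 2, colMeans(x))^2))

# Relative centroid sizes: w_i * CS_i / sqrt(sum((w_i * CS_i)^2)).
relative_cs <- function(cs, w = rep(1, length(cs))) {
  wcs <- cs * w
  wcs / sqrt(sum(wcs^2))
}

small <- circle(20, r = 1)
large <- circle(40, r = 2)  # twice the circumference, twice the landmarks

cs <- c(small = centroid_size(small), large = centroid_size(large))
p  <- c(nrow(small), nrow(large))

rel_scs <- relative_cs(cs)                       # w_i = 1
rel_ncs <- relative_cs(cs, w = 1 / sqrt(p * 2))  # w_i = (p_i * k)^(-1/2), k = 2

rbind(SCS = rel_scs, NCS = rel_ncs)
```

In this setup the small circle has a relative size of 1/3 with \(SCS\) and about 0.447 with \(NCS\), so normalizing inflates the smaller object relative to the larger one.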

Normalizing centroid size is not a universal solution

I.e., \(NCS\) is landmark density-dependent and seems only viable when comparing uniform landmark distributions on the exterior of objects.

We discuss non-uniform landmark distributions in Collyer et al. (2020); these only exacerbate the issues.

Collyer et al. (2020) Evolutionary Biology, in press

Normalizing centroid size is not a universal solution

Empirical example

As a reminder

Normalizing centroid size is not a universal solution

To summarize (thus far)

To emphasize

Should combined configurations be considered shapes? Should they be aligned?

\[\mathbf{Z} = \begin{pmatrix} CS^{'}_1\mathbf{Z}_1\\ CS^{'}_2\mathbf{Z}_2\\ \vdots\\ CS^{'}_g\mathbf{Z}_g\\ \end{pmatrix}.\]

\(\mathbf{Z}\) is a matrix of combined coordinates, centered at \(0,0\) (2D) or \(0,0,0\) (3D) with a (pooled) centroid size equal to \(1\).

This sounds a lot like \(\mathbf{Z}\) is a new set of Procrustes residuals.

Should combined configurations be considered shapes? Should they be aligned?

Collyer et al. (2020) go into more detail, but the simple answer is No.

Combining landmark configurations introduces landmark covariances that have no anatomical meaning. (Even anatomically, if configurations correspond to objects that can change with respect to spatial relationships, like heads and tails, covariances between landmarks in separate configurations do not make sense.)

Combined configurations are composites of shapes, perhaps integrated, which might be used as morphological variables for statistical analyses. But visualization of shape differences (e.g., TPS warp grids) should not be performed on combined configurations. It is possible to map multiple shape changes to one point in a combined PC plot.

Collyer et al. (2020) Evolutionary Biology, in press

An example in geomorph, using combine.subsets

library(geomorph)
## Loading required package: RRPP
## Loading required package: rgl
data(larvalMorph) 
attributes(larvalMorph)
## $names
## [1] "headcoords"   "tailcoords"   "head.sliders" "tail.sliders" "treatment"   
## [6] "family"
head.gpa <- gpagen(larvalMorph$headcoords, curves = larvalMorph$head.sliders,
                   print.progress = FALSE)
tail.gpa <- gpagen(larvalMorph$tailcoords, curves = larvalMorph$tail.sliders,
                   print.progress = FALSE)

Note that we can also build whole-organism coordinates and sliders, and perform a single GPA on them:

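# Offset the tail slider indices by the number of head landmarks (26 here)
n.head <- dim(larvalMorph$headcoords)[1]
sliders <- rbind(larvalMorph$head.sliders, n.head + larvalMorph$tail.sliders)

# Stack head and tail landmarks for each specimen into one p x k x n array
coords <- simplify2array(
  lapply(seq_along(larvalMorph$treatment), function(j){
    rbind(larvalMorph$headcoords[,,j], larvalMorph$tailcoords[,,j])
  })
)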

all.gpa <- gpagen(coords, curves = sliders, print.progress = FALSE)

An example in geomorph, using combine.subsets

plot(all.gpa)

This should look familiar

An example in geomorph, using combine.subsets

par(mfrow = c(1, 2))

plot(tail.gpa)
plot(head.gpa)

par(mfrow = c(1, 1))

This should look familiar

An example in geomorph, using combine.subsets

Combine with \(SCS\)

comb.lm <- combine.subsets(head = head.gpa, tail = tail.gpa, gpa = TRUE)
summary(comb.lm)
## 
## A total of 2 subsets were combined
## 
##                                  head       tail
## Number of points in subset 26.0000000 64.0000000
## Mean centroid size          5.7101870 24.2377131
## Mean relative size          0.1906707  0.8093293
par(mfrow = c(1,2))
plotAllSpecimens(comb.lm$coords)
plot(comb.lm$coords[,,1], pch = 21, bg = c(rep(1,26), rep(2,64)), asp = 1)

par(mfrow = c(1,1))

An example in geomorph, using combine.subsets

Combine with \(NCS\)

comb.lm <- combine.subsets(head = head.gpa, tail = tail.gpa, gpa = TRUE,
                           norm.CS = TRUE)
summary(comb.lm)
## 
## A total of 2 subsets were combined
## 
##                                     head       tail
## Number of points in subset    26.0000000 64.0000000
## Mean normalized centroid size  1.1198598  3.0297141
## Mean relative size             0.2698734  0.7301266
par(mfrow = c(1,2))
plotAllSpecimens(comb.lm$coords)
plot(comb.lm$coords[,,1], pch = 21, bg = c(rep(1,26), rep(2,64)), asp = 1)

par(mfrow = c(1,1))

An example in geomorph, using combine.subsets

Combine with user-defined weights

comb.lm <- combine.subsets(head = head.gpa, tail = tail.gpa, gpa = TRUE,
                           norm.CS = FALSE, weights = c(0.3, 0.7))
summary(comb.lm)
## 
## A total of 2 subsets were combined
## 
##                                    head      tail
## Number of points in subset  26.00000000 64.000000
## Mean weighted centroid size  1.71305611 16.966399
## Mean relative size           0.09170803  0.908292
par(mfrow = c(1,2))
plotAllSpecimens(comb.lm$coords)
plot(comb.lm$coords[,,1], pch = 21, bg = c(rep(1,26), rep(2,64)), asp = 1)

par(mfrow = c(1,1))
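As a rough cross-check of the three summaries printed above, the reported values can be recomputed from the mean centroid sizes alone. This sketch uses only numbers copied from the printed output; the object names are hypothetical, and the interpretation is an inference from those numbers, not a statement about how `combine.subsets` works internally.

```r
# Mean centroid sizes and landmark counts, copied from the SCS summary.
mean_cs <- c(head = 5.7101870, tail = 24.2377131)
p <- c(head = 26, tail = 64)

# Sizes under each option: unweighted, normalized by sqrt(p), user weights.
sizes <- list(
  scs      = mean_cs,
  ncs      = mean_cs / sqrt(p),
  weighted = mean_cs * c(0.3, 0.7)
)

# Each size as a proportion of the sum of sizes.
props <- lapply(sizes, function(s) s / sum(s))

round(sizes$ncs, 7)       # compare with "Mean normalized centroid size"
round(sizes$weighted, 7)  # compare with "Mean weighted centroid size"
lapply(props, round, 7)   # compare with "Mean relative size" in each summary
```

The printed "Mean relative size" rows sum to 1 and agree with these proportions, and the normalized sizes agree with dividing by \(\sqrt{p_i}\) alone, consistent with \(k\) being unnecessary when all configurations share a dimension.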

An example in geomorph, using combine.subsets

Ignore relativization altogether

comb.lm <- combine.subsets(head = head.gpa$coords, tail = tail.gpa$coords,
                           gpa = FALSE, CS.sets = NULL)
## 
## No CS sets input.  Final configurations will not be scaled
par(mfrow = c(1,2))
plotAllSpecimens(comb.lm$coords)
plot(comb.lm$coords[,,1], pch = 21, bg = c(rep(1,26), rep(2,64)), asp = 1)

par(mfrow = c(1,1))

More on combine.subsets

Parting Thoughts

Acknowledgments

Collyer et al. 2020. Making Heads or Tails of Combined Landmark Configurations in Geometric Morphometric Data. Evolutionary Biology, In Press.

https://www.researchgate.net/publication/341946958_Making_Heads_or_Tails_of_Combined_Landmark_Configurations_in_Geometric_Morphometric_Data

Co-authors: Mark Davis · Dean Adams

Funding: NSF Grants DEB-1737895 and DBI-1902694 (to MC) and DEB-1556379 and DBI-1902511 (to DA)